Nature Human Behaviour
Springer Science and Business Media LLC
All preprints, ranked by how well they match Nature Human Behaviour's content profile, based on 85 papers previously published here. The average preprint has a 0.15% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Yang, Y.; Spektor, M.; Thoma, A. I.; Hertwig, R.; Wulff, D. U.
Learning from experience is central to human decision making, yet research on experience-based choice remains fragmented across paradigms and disciplines. We present the Decision-from-Experience Database (DfE-DB), a standardized, openly accessible resource comprising 3.8 million trial-level decisions from 11,921 participants across 168 studies and 13 paradigms. By harmonizing raw behavioral data and classifying studies along 13 key design features, the database enables quantitative comparisons previously obscured by heterogeneous task and data structures. Using this resource, we show that choice tendencies--toward higher risk, expected value, or experienced mean--vary substantially across paradigms and are strongly shaped by core design features such as feedback type, outcome structure, stationarity, and sampling. These features explain substantial cross-study variability and reveal underexplored paradigm variants. DfE-DB provides the empirical infrastructure necessary to test the generality of behavioral phenomena and computational models, fostering a more integrated science of decisions from experience.
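The abstract above describes harmonizing trial-level choice data so that tendencies such as expected-value maximization can be compared across studies. A minimal sketch of what that kind of analysis looks like on a harmonized table follows; the column names and the EV-maximization measure are illustrative assumptions, not DfE-DB's actual schema.

```python
import pandas as pd

# Hypothetical trial-level schema loosely modeled on the DfE-DB description;
# column names here are illustrative assumptions, not the database's actual fields.
trials = pd.DataFrame({
    "study_id":    ["s1"] * 4,
    "participant": [1, 1, 2, 2],
    "chosen_ev":   [2.0, 3.0, 1.5, 3.0],   # expected value of the chosen option
    "other_ev":    [3.0, 2.0, 3.0, 1.5],   # expected value of the forgone option
})

# One simple "choice tendency" measure: how often the higher-EV option was chosen.
trials["ev_maximizing"] = trials["chosen_ev"] > trials["other_ev"]
rate = trials.groupby("study_id")["ev_maximizing"].mean()
print(rate)
```

Because every study is reduced to the same columns, the same groupby runs unchanged across paradigms, which is the point of harmonization.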
Symul, L.; Hsieh, P.; Shea, A.; Moreno, C. R. d. C.; Skene, D.; Holmes, S.; Martinez, M. E.
The mechanisms of human birth seasonality have been debated for over 150 years [1]. In particular, the question of whether sexual activity or fertility variations drive birth seasonality has remained open and challenging to test without large-scale data on sexual activity [2,3]. Analyzing data from half-a-million users worldwide collected from the female health tracking app Clue combined with birth records, we inferred that birth seasonality is primarily driven by seasonal fertility, yet increased sexual activity around holidays explains minor peaks in the birth curve. Our data came from locations in the Northern Hemisphere (UK, US, and France) and the Southern Hemisphere (Brazil). We found that fertility peaks between the autumn equinox and winter solstice in the Northern Hemisphere locations and shortly following the winter solstice in the Southern Hemisphere locations.
Kennedy, E.; Vadlamani, S.; Lindsey, H. M.; Lei, P.-W.; Pugh, M. J.; Adamson, M.; Alda, M.; Alonso-Lana, S.; Ambrogi, S.; Anderson, T. J.; Arango, C.; Asarnow, R.; Avram, M.; Ayesa-Arriola, R.; Babikian, T.; Banaj, N.; Bird, L. J.; Borgwardt, S.; Brodtmann, A.; Brosch, K.; Caeyenberghs, K.; Calhoun, V. D.; Chiaravalloti, N. D.; Cifu, D. X.; Crespo-Facorro, B.; Dalrymple-Alford, J. C.; Dams-O'Connor, K.; Dannlowski, U.; Darby, D.; Davenport, N.; DeLuca, J.; Diaz-Caneja, C. M.; Disner, S. G.; Dobryakova, E.; Ehrlich, S.; Esopenko, C.; Ferrarelli, F.; Frank, L. E.; Franz, C.; Fuentes-Claramonte,
Investigators in neuroscience have turned to Big Data to address replication and reliability issues by increasing sample sizes, statistical power, and representativeness of data. These efforts unveil new questions about integrating data arising from distinct sources and instruments. We focus on the most frequently assessed cognitive domain - memory testing - and demonstrate a process for reliable data harmonization across three common measures. We aggregated global raw data from 53 studies totaling N = 10,505 individuals. A mega-analysis was conducted using empirical Bayes harmonization to remove site effects, followed by linear models adjusting for common covariates. A continuous item response theory (IRT) model estimated each individual's latent verbal learning ability while accounting for item difficulties. Harmonization significantly reduced inter-site variance while preserving covariate effects, and our conversion tool is freely available online. This demonstrates that large-scale data sharing and harmonization initiatives can address reproducibility and integration challenges across the behavioral sciences. Teaser: We present a global effort to devise harmonization procedures necessary to meaningfully leverage big data.
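The core idea of removing site effects by shrinking per-site estimates toward a pooled estimate can be illustrated with a small simulation. This is a sketch of location harmonization in the spirit of empirical Bayes methods such as ComBat, not the authors' actual pipeline; the shrinkage weight and the simulated offsets are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated memory scores from three "sites" with additive site offsets.
sites = np.repeat([0, 1, 2], 200)
ability = rng.normal(0.0, 1.0, 600)
offsets = np.array([-0.8, 0.0, 0.9])          # site effects to be removed
scores = ability + offsets[sites] + rng.normal(0.0, 0.3, 600)

def harmonize(scores, sites, shrink=0.9):
    """Remove per-site mean shifts, shrinking each site mean toward the grand mean."""
    grand = scores.mean()
    out = scores.copy()
    for s in np.unique(sites):
        m = sites == s
        # empirical-Bayes-style partial pooling of the site mean
        out[m] = scores[m] - shrink * (scores[m].mean() - grand)
    return out

h = harmonize(scores, sites)
site_var_before = np.var([scores[sites == s].mean() for s in range(3)])
site_var_after = np.var([h[sites == s].mean() for s in range(3)])
print(site_var_before, site_var_after)  # inter-site variance shrinks markedly
```

Within-site variation (the covariate-relevant signal) is untouched; only the between-site mean differences are pulled in, which mirrors the "reduced inter-site variance while preserving covariate effects" result described above.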
Cao, Y.; Almeras, C.; Lee, J. K.; Wyart, V.
Everyday decisions aim not only to earn rewards but also to learn about the world. Across three studies (total N = 702), we examined how people gather epistemic information stripped of rewarding value, and compared their strategy with reward seeking in otherwise matched conditions. Computational modeling of human behavior revealed a two-stage information-seeking policy, where participants first repeatedly sample each novel option in turn to test provisional hypotheses, a process we call streaking, before transitioning to uncertainty-guided exploration. While artificial neural networks trained to optimize inference accuracy acquired uncertainty-guided exploration but not early streaking, this two-stage policy improves human inference accuracy under noisy belief updating. Streaking and uncertainty-guided exploration tend to be co-expressed in the same individuals but map onto distinct psychological traits. Together, these results offer a novel account of human information seeking, clarifying its motives and benefits in epistemic contexts beyond the reward-centric explore-exploit tradeoff.
Chen, Y.; Yang, Z.; Ahn, S.; Samaras, D.; Hoai, M.; Zelinsky, G.
Attention control is a basic behavioral process that has been studied for decades. The currently best models of attention control are deep networks trained on free-viewing behavior to predict bottom-up attention control--saliency. We introduce COCO-Search18, the first dataset of laboratory-quality goal-directed behavior large enough to train deep-network models. We collected eye-movement behavior from 10 people searching for each of 18 target-object categories in 6202 natural-scene images, yielding ~300,000 search fixations. We thoroughly characterize COCO-Search18, and benchmark it using three machine-learning methods: a ResNet50 object detector, a ResNet50 trained on fixation-density maps, and an inverse-reinforcement-learning model trained on behavioral search scanpaths. Models were also trained/tested on images transformed to approximate a foveated retina, a fundamental biological constraint. These models, each having a different reliance on behavioral training, collectively comprise the new state-of-the-art in predicting goal-directed search fixations. Our expectation is that future work using COCO-Search18 will far surpass these initial efforts, finding applications in domains ranging from human-computer interactive systems that can anticipate a person's intent and render assistance to the potentially early identification of attention-related clinical disorders (ADHD, PTSD, phobia) based on deviation from neurotypical fixation behavior.
Yang, Y.; Liu, M.; Morrison, K.; Lagisz, M.; Nakagawa, S.
Environmental enrichment has long been recognized as a non-pharmacological intervention to mitigate mental health issues, yet its efficacy, and heterogeneity of treatment effects across experimental contexts remain underexplored. Heterogeneity of treatment effects, which reflects variability in individual responses to interventions, is a critical factor in determining the generalizability and personalization needs of treatments. Here, we conducted a registered meta-analysis of 62 studies and 1,112 comparisons in rodent models to evaluate the impact of environmental enrichment on depressive and anxiety-like behaviours. We found that environmental enrichment reduced these behaviours of animal models by 16% on average and decreased inter-individual variability by 12%, indicating not only effectiveness but also low heterogeneity of treatment effects, which suggests consistent effects across individuals. Environmental enrichment further nullified the adverse effects of stressors, demonstrating a significant antagonistic interaction. These effects were robust across multiple sensitivity analyses, including model-based predictions, post-stratification, multi-model inference, publication bias correction, and critical appraisal of study quality. Moderator analyses highlighted the importance of exposure timing and the inclusion of social enrichment components. Taken together, our pre-clinical evidence on rodent models supports environmental enrichment as a low-cost, scalable, and biologically grounded intervention with translational relevance for developing equitable and accessible treatments for depression and anxiety. Given the importance of innovation and personalization in mental health care, the low heterogeneity of treatment effects of environmental enrichment positions it as a promising avenue for non-pharmacological therapeutic strategies that can be broadly applied without extensive tailoring.
Thorpe, H. H.; Fontanillas, P.; Pham, B.; Meredith, J. J.; Jennings, M. V.; Courchesne-Krak, N. S.; Vilar-Ribo, L.; Bianchi, S. B.; Mutz, J.; 23andMe Research Team; Elson, S. L.; Khokhar, J. Y.; Abdellaoui, A.; Davis, L. K.; Palmer, A. A.; Sanchez-Roige, S.
Coffee is one of the most widely consumed beverages. We performed a genome-wide association study (GWAS) of coffee intake in US-based 23andMe participants (N=130,153) and identified 7 significant loci, with many replicating in three multi-ancestral cohorts. We examined genetic correlations and performed a phenome-wide association study across thousands of biomarkers and health and lifestyle traits, then compared our results to the largest available GWAS of coffee intake from UK Biobank (UKB; N=334,659). The results of these two GWAS were highly discrepant. We observed positive genetic correlations between coffee intake and psychiatric illnesses, pain, and gastrointestinal traits in 23andMe that were absent or negative in UKB. Genetic correlations with cognition were negative in 23andMe but positive in UKB. The only consistent observations were positive genetic correlations with substance use and obesity. Our study shows that GWAS in different cohorts could capture cultural differences in the relationship between behavior and genetics.
Chae, V. J.; Grootswagers, T.; Bode, S.; Feuerriegel, D.
Investigating the neurocognitive mechanisms underlying food choices has the potential to advance our understanding of eating behaviour and inform health-targeted interventions and policy. Large, publicly available neural and behavioural datasets can enable new discoveries and targeted hypothesis tests, yet no such datasets are currently available. We present the FOODEEG dataset containing electroencephalographic (EEG) responses to a diverse array of food images, as well as behavioural measures of food cognition (food categorisation task, food go/no-go task, and food choice task responses), collected from 117 participants. We also provide normative ratings for the food image stimuli with respect to 22 food attributes, including nutritive, hedonic and taste properties, familiarity, and elicited emotions. Our dataset also includes questionnaire-based measures of participants' food motivations, dietary styles, and general motivational tendencies. In the validation analyses, we demonstrate that early food-evoked EEG responses in our dataset are consistent with observations in previous work. The FOODEEG dataset will be valuable for accelerating research into the neural substrates of visual food processing, dietary decisions, and individual differences.
Wright, L.; Steptoe, A.; Fancourt, D.
In the absence of a vaccine, governments have focused on social distancing, self-isolation, and increased hygiene procedures to reduce the transmission of SARS-CoV-2 (COVID-19). Compliance with these measures requires voluntary cooperation from citizens. Yet, compliance is not complete, and existing studies provide limited understanding of what factors influence compliance; in particular modifiable factors. We use weekly panel data from 51,000 adults across the first three months of lockdown in the UK to identify factors that are related to compliance with COVID-19 guidelines. We find evidence that increased confidence in government to tackle the pandemic is longitudinally related to higher compliance, but little evidence that factors such as mental health and wellbeing, worries about future adversities, and social isolation and loneliness are related to changes in compliance. Our results suggest that to effectively manage the pandemic, governments should ensure that confidence is maintained, something which has not occurred in all countries.
Arroyo, L.; Liljeholm, M.
Human decision-makers have a well-established preference for controllable environments. We combined a hierarchical gambling task with cross-sectional administration of psychometric surveys and computational cognitive modeling, to assess whether this preference extends to contexts in which decision outcomes benefit others, specifically charitable organizations. In neurotypical individuals (n=100), there was a dramatic reduction in the preference for free choice across self- and charity-benefiting gambling contexts when freely chosen options yielded divergent outcome distributions - i.e., when free choice afforded control over decision outcomes. This selective modulation is consistent with a cost-benefit analysis, trading cognitive effort and controllability gains against the utilities of earning money for oneself vs. for a charity. When the same task was administered to individuals with self-reported obsessive-compulsive disorder (OCD; n=108) the preference for control over decision outcomes was preserved across self- and charity-benefiting contexts, consistent with a responsibility-OCD subtype and with an excessive subjective utility of control more generally.
Mason, A.; Lindskog, M.; Hertwig, R.; Wulff, D.
Reinforcement learning (RL) models explain how people adapt behavior through incremental value updates but assume that individual experiences are not explicitly stored or retrieved. Across two experiments (N = 282 and N = 1,818), we tested whether people rely on such explicit memory representations during experience-based choice. Participants sampled outcomes from two lotteries and, in an "ignore" condition, were instructed to disregard specific outcomes before deciding. Ignoring these outcomes substantially altered preferences, suggesting that choices were guided by explicit episodic representations rather than cumulative reinforcement. Frequency judgments revealed generally accurate memory for experienced outcomes, but reduced precision for continuous compared with discrete outcome distributions. These findings challenge purely incremental RL accounts and support theories proposing that human choice integrates flexible episodic memory with reinforcement mechanisms, bridging models of learning, memory, and decision-making. Statement of Relevance: How people learn from experience shapes decisions in many everyday situations, from choosing a stock to invest in to selecting a restaurant or deciding which route to take home. Our experiments show that people do not rely solely on incremental value updates but instead exploit memory representations of past outcomes to guide choices. When asked to ignore certain outcomes, participants adjusted their preferences in line with these representations. These findings reveal that human decision-making also engages flexible use of memory for past experiences, highlighting the importance of memory-based mechanisms in adaptive, real-world choice behaviour.
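The contrast the abstract draws, incremental value updating versus explicitly stored episodes, can be made concrete with two toy valuation rules. This is an illustration of the conceptual distinction only; the learning rate and outcome values are arbitrary assumptions, not the paper's fitted models.

```python
def incremental_value(outcomes, alpha=0.3):
    """Delta-rule update: each outcome nudges a single running value estimate."""
    v = 0.0
    for o in outcomes:
        v += alpha * (o - v)
    return v

def episodic_value(outcomes, ignore=()):
    """Average over explicitly stored episodes, dropping 'ignored' trial indices."""
    kept = [o for i, o in enumerate(outcomes) if i not in ignore]
    return sum(kept) / len(kept)

samples = [10, 10, 0, 10]
# A purely incremental learner has no stored trials to discard after the fact...
v_rl = incremental_value(samples)
# ...whereas an episodic learner can recompute value from the retained episodes.
v_ep = episodic_value(samples, ignore={2})
print(v_rl, v_ep)  # episodic value is 10.0 once the zero outcome is ignored
```

The "ignore" instruction in the experiments leverages exactly this asymmetry: only a learner with access to individual episodes can cleanly excise one of them, which is why the observed preference shifts implicate episodic memory.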
Benwell, C. S. Y.; Beyer, R.; Wallington, F.; Ince, R. A. A.
Human decision-making and self-reflection often depend on context and internal biases. For instance, decisions are often influenced by preceding choices, regardless of their relevance. It remains unclear how choice history influences different levels of the decision-making hierarchy. We employed analyses grounded in information and detection theories to estimate the relative strength of perceptual and metacognitive history biases, and to investigate whether they emerge from common/unique mechanisms. Though both perception and metacognition tended to be biased towards previous responses, we observed novel dissociations which challenge normative theories of confidence. Different evidence levels often informed perceptual and metacognitive decisions within observers, and response history distinctly influenced 1st (perceptual) and 2nd (metacognitive) order decision-parameters, with the metacognitive bias likely to be strongest and most prevalent in the general population. We propose that recent choices and subjective confidence represent heuristics which inform 1st and 2nd order decisions in the absence of more relevant evidence.
Farley, M.; Munafo, M. R.; Lewis, A.; Nicolet, B. R.
Science is - in principle - self-correcting, but there is growing evidence that such self-correction can be slow, and that spurious findings continue to drive research activity that is no longer justified. Here we highlight the environmental impact of this failure to self-correct sufficiently rapidly. We identified a non-fraudulent occurrence of irreproducible findings: the literature on the association between genetic variation in the serotonin transporter gene (5-HTTLPR) with anxiety and depression. An initial report in 1996 found evidence for an association, but as early as 2005 a study three orders of magnitude larger found no evidence for it. Nevertheless, studies investigating this association continue to be published. We isolated 1,183 studies published between 1996 and 2024 that investigated the association and calculated an estimated carbon footprint of these studies. We estimate that the failure to self-correct had a footprint of approximately 30,068 tons of CO2 equivalent. Our aim is to present a case study of the potential carbon footprint of research activity that is no longer justified once a theory is disproven. We highlight the importance of integrating self-correction mechanisms within research, and embracing the need to discontinue unfruitful lines of enquiry.
Wang, W.; Kaufmann, T.; Dayan, P.
Inhibition is a core cognitive control function whose competence is distributed across the population, with more extreme impairments in psychiatric conditions such as attention deficit hyperactivity disorder (ADHD). The Stop Signal Task (SST) is a widely used paradigm for assessing this ability. However, conventional formalizations of SST performance, such as the independent race model, rely on assumptions that are frequently violated in modern experimental designs. Furthermore, the typical focus is on fitting mean reaction times, overlooking trial-by-trial dynamics. To address these limitations, we model the SST as a partially observable Markov decision process. This framework characterizes inhibitory control through distinct components: noisy perceptual inference regarding stimuli, and optimal control balanced against potential costs. To assess the ability of the model to capture the distribution of inhibitory capacities, we fit it to data from the large Adolescent Brain Cognitive Development (ABCD) study baseline cohort (N = 5,114). To do this, we adapted Simulation-Based Inference with a transformer-based encoder. This architecture learns compact, sequence-aware embeddings from raw behavioral data. These embeddings enable amortized inference of individual-level parameter posteriors in an efficient and reliable end-to-end manner, as confirmed by extensive validation. We identified distinct computational phenotypes associated with ADHD traits. Children with higher ADHD scores exhibited greater directional imprecision, a diminished intrinsic penalty for inhibition failures, and a more deterministic response style. Notably, the learned embedding space reveals a continuous manifold where children with the higher ADHD scores are heterogeneously distributed, rather than forming distinct disorder clusters. This indicates that similar clinical traits can emerge from diverse combinations of computational mechanisms, supporting a dimensional perspective on neurodiversity. 
Our framework can be extended to a broader range of cognitive tasks, offering a scalable solution for fitting complex models to large-scale behavioral data. Author summary: Inhibitory control is essential for adjusting thoughts and behavior and is often impaired in conditions like ADHD. Traditional models of the Stop Signal Task (SST) often oversimplify the complex decision-making involved. We formalized these cognitive processes using a more biologically grounded framework (POMDP). This approach separates perceptual processing from control adjustments and remains valid in diverse experimental designs where traditional models fail. To apply the model at scale, we developed a specialized machine learning approach (TeSBI). This allowed us to efficiently reverse-engineer individual cognitive profiles. Applying it to the ABCD dataset (which includes more than 5,000 children), we found that higher ADHD scores are linked to specific computational deficits: noisy sensory processing, a lack of concern for errors, and a deterministic response style. Crucially, children with higher ADHD scores did not form a single disorder cluster but displayed diverse cognitive combinations, supporting a dimensional view of neurodiversity. Our results show that our model effectively captures complex inhibition mechanisms. By combining theory-driven cognitive modeling with scalable data-driven inference, this framework enables the precise analysis of large-scale behavioral datasets. This paves the way for more personalized approaches in computational psychiatry by recognizing the heterogeneity within clinical traits.
Kvalvik, E. H.; Wang, Y.; Walhovd, K. B.; Lyngstad, T. H.; Rogeberg, O.
Although educational attainment is heritable, its conventional measurement in genetic research as years of education (EduYears) is not designed to reveal potential stage-specific genetic influences across discrete milestones. In two Norwegian cohorts (Norwegian Mother, Father and Child Cohort Study, N = 120,527; Norwegian Twin Registry, N = 8,910), we quantified the genetic contributions to completing high school, bachelor's, master's and PhD using genome-wide association studies (GWAS), polygenic indices (PGIs) and twin models. Transition-specific analyses, conditioning on prior success, revealed that observed-scale common-variant heritability (h2SNP) and PGI predictability followed an inverse-U pattern, peaking at the transition into higher education (h2SNP ≈ 0.14; R2Tjur ≈ 0.05) before declining for postgraduate degrees. Genetic correlations (rg) with large-scale GWAS of EduYears (EA4) and intelligence (IQ3) were high for early transitions but declined markedly for later ones (e.g., rg with EA4 from ≈ 0.92 to ≈ 0.38). In cumulative analyses, aggregating liability across prior milestones, the gap between twin- and SNP-based heritability narrowed at higher levels of attainment (h2twin ≈ 0.6 → 0.3; h2SNP ≈ 0.22 → 0.19), while the genetic overlap between distant milestones diminished (rg ≈ 0.92 → 0.71). These patterns, obscured by EduYears metrics, highlight a dynamic genetic architecture across educational milestones, refining polygenic prediction and addressing misconceptions about uniform genetic influences on educational progression.
Spisak, T.; Bingel, U.; Wager, T. D.
Brain-Wide Association Studies (BWAS) have become a dominant method for linking mind and brain over the past 30 years. Univariate models test tens to hundreds of thousands of brain voxels individually, whereas multivariate models (multivariate BWAS) integrate signals across brain regions into a predictive model. Numerous problems have been raised with univariate BWAS, including lack of power and reliability and an inability to account for pattern-level information embedded in distributed neural circuits [1-3]. Multivariate predictive models address many of these concerns, and offer substantial promise for delivering brain-based measures of behavioral and clinical states and traits [2,3]. In their recent paper [4], Marek et al. evaluated the effects of sample size on univariate and multivariate BWAS in three large-scale neuroimaging datasets and came to the general conclusion that "BWAS reproducibility requires samples with thousands of individuals". We applaud their comprehensive analysis, and we agree that (a) large samples are needed when conducting univariate BWAS of individual differences in trait measures, and (b) multivariate BWAS reveal substantially larger effects and are therefore more highly powered. However, we disagree with Marek et al.'s claims that multivariate BWAS provide "inflated in-sample associations" that often fail to replicate (i.e., are underpowered), and that multivariate BWAS consequently require thousands of participants when predicting trait-level individual differences. Here we substantiate that (i) with appropriate methodology, the reported in-sample effect size inflation in multivariate BWAS can be entirely eliminated, and (ii) in most cases, multivariate BWAS effects are replicable with substantially smaller sample sizes (Figure 1).
Figure 1. Multivariate BWAS provide unbiased effect sizes and high replicability with low-moderate sample sizes. (a) In-sample effects in multivariate BWAS are only inflated if estimates are obtained without cross-validation. (b) Cross-validation fully eliminates in-sample effect size inflation and, as a consequence, provides higher replicability. Each point in (a) and (b) corresponds to one bootstrap subsample, as in Fig. 4b of Marek et al. Dotted lines denote the threshold for p=0.05 with n=495. (c) The inflation of in-sample effect size obtained without cross-validation (red) is reduced, but does not disappear, at higher sample sizes. Conversely, cross-validated estimates (blue) are slightly pessimistic with low sample sizes and become quickly unbiased as sample size is increased. (d) Without cross-validation, in-sample effect size estimates are non-zero (r ≈ 0.5, red) even when predicting permuted outcome data. Cross-validation eliminates systematic bias across all sample sizes (blue). Dashed lines in (c) and (d) denote 95% parametric confidence intervals, and shaded areas denote bootstrap and permutation-based confidence intervals. (e-f) Cross-validated analysis reveals that sufficient in-sample power (e) and out-of-sample replication probability (P(rep)) (f) can be achieved for a variety of phenotypes at low or moderate sample sizes. 80% power and P(rep) are achievable in <500 participants for half the phenotypes tested (colored bars) using the prediction algorithm in Marek et al. (top panels in (e) and (f); sample size required for 80% power or P(rep) shown). Other phenotypes require sample sizes >500 (bars with arrows). Power and P(rep) can be substantially improved with a ridge regression-based model recommended in some comparison studies [10,11] (bottom panels in (e) and (f)), with 80% power and P(rep) at sample sizes as low as n=100 and n=75, respectively, when predicting cognitive ability, and sample sizes between 75 and 375 for other investigated variables, except inhibition assessed with the flanker task. (g) We estimated interactions between sample size and publication bias by computing effect size inflation (r_discovery - r_replication) only for those bootstrap cases where prediction performance was significant (p<0.05) in the replication sample. Our analysis shows that the effect size inflation due to publication bias is modest (<10%) with <500 participants for half the phenotypes using the Marek et al. model and all phenotypes but the flanker using the ridge model.
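The central methodological point here, that in-sample multivariate effect sizes are inflated without cross-validation but unbiased with it, even for null data, can be reproduced in a small simulation. This sketch uses plain NumPy with a closed-form ridge fit; the ridge penalty, fold count, and simulated dimensions are arbitrary assumptions, not the authors' analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 500                        # more features than subjects, as in BWAS
X = rng.normal(size=(n, p))
y = rng.normal(size=n)                 # null outcome: no true brain-behavior signal

def ridge_fit(X, y, lam=10.0):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# In-sample: fit and evaluate on the same subjects (prone to overfitting).
r_in = np.corrcoef(X @ ridge_fit(X, y), y)[0, 1]

# 5-fold cross-validation: evaluate only on held-out subjects.
preds = np.empty(n)
for k in range(5):
    test = np.arange(n) % 5 == k
    preds[test] = X[test] @ ridge_fit(X[~test], y[~test])
r_cv = np.corrcoef(preds, y)[0, 1]

print(round(r_in, 2), round(r_cv, 2))  # in-sample r is large; CV r hovers near zero
```

Even though y carries no signal, the in-sample correlation is large, echoing panel (d) above, while the cross-validated estimate stays near zero, which is why cross-validated effect sizes can be trusted at much smaller sample sizes.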
Xu, T.; Zhou, T.; Wang, Y.; Yang, P.; Tang, S.; Shao, K.; Tang, Z.; Liu, Y.; Chen, X.; Wang, H.; Wang, X.; Luo, H.; Wang, J.; Hu, J.; Yu, J.
Analyzing animal behavior is crucial in advancing neuroscience, yet quantifying and deciphering its intricate dynamics remains a significant challenge. Traditional machine vision approaches, despite their ability to detect spontaneous behaviors, fall short due to limited interpretability and reliance on manual labeling, which restricts the exploration of the full behavioral spectrum. Here, we introduce MouseGPT, a Vision-Language Model (VLM) that integrates visual cues with natural language to revolutionize mouse behavior analysis. Built upon our first-of-its-kind dataset--incorporating pose dynamics and open-vocabulary behavioral annotations across over 42 million frames of diverse psychiatric conditions--MouseGPT provides a novel, context-rich method for comprehensive behavior interpretation. Our holistic analysis framework enables detailed behavior profiling, clustering, and novel behavior discovery, offering deep insights without the need for labor-intensive manual annotation. Evaluations reveal that MouseGPT surpasses existing models in precision, adaptability, and descriptive richness, positioning it as a transformative tool for ethology and for unraveling complex behavioral dynamics in animal models.
Khalid, I.; Rodrigues, B.; Dreyfus, H.; Frileux, S.; Meissner, K.; Fossati, P.; Hare, T. A.; Schmidt, L.
Expectancies, which are higher order prognostic beliefs, can have powerful effects on experiences, behavior and brain. However, it is unknown where, how, and when in the brain prognostic beliefs influence appetitive interoceptive experiences and related economic behavior. This study combined a placebo intervention on hunger with computational modelling and functional magnetic resonance imaging of value-based decision-making. The results show that prognostic beliefs about hunger shape hunger experiences, how much participants value food, and food-value encoding in the prefrontal cortex. Computational modelling further revealed that these placebo effects were underpinned by how much and when during the decision process taste and health information are integrated into the accumulation of evidence toward a food choice. The drift weights of both sources of information further moderated ventromedial and dorsolateral prefrontal cortex interactions during choice formation. These findings provide novel insights into the neurocognitive mechanisms that translate higher order prognostic beliefs into non-aversive interoceptive sensitivity and shape decision-making.
Liu, A.; Akimova, E. T.; Ding, X.; Jukarainen, S.; Vartiainen, P.; Kiiskinen, T.; Kuitunen, S.; Havulinna, A. S.; Gissler, M.; Lombardi, S.; Fall, T.; Mills, M. C.; Ganna, A.
The percentage of women born 1965-1975 remaining childless is ~20% in many Western European and ~30% in some East Asian countries. Around a quarter of childless women are voluntarily childless, suggesting a remaining role for disease. Single diseases have been linked to childlessness, mostly in women, yet we lack a comprehensive picture of the effect of early-life diseases on lifetime childlessness. We examined all individuals born 1956-1968 (men) and 1956-1973 (women) in Finland (n=1,035,928) and Sweden (n=1,509,092) to completion of reproduction in 2018 (age 45 women; 50 men). Leveraging nationwide registers, we associated sociodemographic and reproductive information with 414 diseases across 16 categories, using a population and matched pair case-control design of siblings discordant for childlessness (71,524 full-sisters, 77,622 full-brothers). The strongest associations were mental-behavioural, particularly amongst men (schizophrenia, acute alcohol intoxication), congenital anomalies and endocrine-nutritional-metabolic disorders (diabetes), strongest amongst women. We identified novel associations for inflammatory (e.g., myocarditis) and autoimmune diseases (e.g., juvenile idiopathic arthritis). Associations were dependent on age at onset, earlier in women (21-25 years) than men (26-30 years). Disease association was mediated by singlehood, especially in men, and by educational level. Evidence can be used to understand how disease contributes to involuntary childlessness. Text box: Defining Childlessness. We use the term childlessness to describe individuals that have had no live-born children by the end of their reproductive lifespan (age 45 for women; 50 for men). Childlessness is defined in the literature as being both involuntary, related to biology and fecundity (e.g., infertility, inability to find a partner), and voluntary or childfree [1] (e.g., active choice, preference [2]).
It has been estimated that 4-5% of the current 15-20% of women who are childless in Europe are voluntarily childless [3]. Childless individuals are subjected to discrimination and marginalization in many societies [4], with infertile women globally experiencing multiple types of violence and coercion [5]. A parallel line of work, which is not the position of this paper or authors, is to problematize and stigmatize childless individuals as egoistic and place blame on this group for producing a so-called demographic disaster of shrinking and ageing populations and collapse of social security systems [6]. The approach of this paper is to provide a neutral, data-driven, and factual examination of early-life diseases related to childlessness, with the aim of building a better understanding of health to prevent childlessness among those who want to have children.
Potter, H. G.
Show abstract
Generative artificial intelligence (genAI) tools are increasingly used by prospective higher education (HE) applicants seeking guidance on university and programme selection. Despite rapidly expanding use, little is known about how genAI systems may introduce or amplify bias in undergraduate admissions decision-making. Here, we systematically examined patterns of bias across three widely used genAI chatbots (ChatGPT, Copilot, Gemini) using neuroscience as a representative UK undergraduate programme. We constructed 216 prompts that varied by applicant characteristics (e.g. gender, study type, academic attainment). Each prompt was submitted to all three chatbots, generating 648 responses and 3240 individual programme recommendations. Output responses underwent text analysis (e.g. n-grams, gender-coded language), and national HE markers of esteem (REF21, TEF23, NSS24) were analysed. Applicant grades and priorities produced the strongest effects on genAI outputs. Higher-grade applicants and those prioritising research received significantly more masculine-coded language, independent of applicant gender. N-gram patterns also diverged: high-grade prompts more frequently elicited terms relating to excellence and research intensity, whereas lower-grade prompts produced greater emphasis on widening access. Recommendations were systematically skewed, with higher grades, private schooling, and research-focused priorities increasing the likelihood of recommending elite institutions and programmes with higher entry requirements. Critically, the gender-coded language of outputs predicted institutional characteristics: masculine-coded responses were associated with recommendations featuring higher entry thresholds and stronger research performance, while feminine-coded responses favoured institutions with higher student satisfaction. These findings reveal clear, systematic biases in how genAI guides prospective HE applicants. 
Such biases risk reinforcing existing educational and socioeconomic inequalities, underscoring the need for transparency, regulation, and oversight in the use of genAI within HE decision-making.
Highlights:
- GenAI is widely used by HE applicants despite little study of its biases.
- 216 prompts across 3 chatbots generated 3240 programme suggestions.
- Grades and priorities drove major shifts in language and recommendations.
- Gender-coded wording mapped onto research strength and entry standards.
- GenAI biases may reinforce inequalities in HE admissions decision-making.